100 research outputs found

    To adapt or not to adapt? Technical debt and learning driven self-adaptation for managing runtime performance

    A self-adaptive system (SAS) can adapt itself to optimize various key performance indicators in response to the dynamics and uncertainty of its environment. In this paper, we present Debt Learning Driven Adaptation (DLDA), a framework that dynamically determines when and whether to adapt the SAS at runtime. DLDA leverages the temporal adaptation debt, a notion derived from the technical debt metaphor, to quantify the time-varying monetary debt that the SAS carries in relation to its performance and Service Level Agreements. We designed a temporal net debt driven labeling scheme to label whether it is economically healthier to adapt the SAS (or not) in a given circumstance; an online machine learning classifier learns this correlation and then predicts whether to adapt under future circumstances. We conducted comprehensive experiments to evaluate DLDA with two different planners, using 5 online machine learning classifiers, in comparison to 4 state-of-the-art debt-oblivious triggering approaches. The results reveal the effectiveness and superiority of DLDA according to different metrics.
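    To make the triggering idea concrete, here is a minimal sketch under stated assumptions: temporal net debt is taken to be the SLA penalty accrued minus the cost of adapting now, and an off-the-shelf online classifier (scikit-learn's SGDClassifier, our choice rather than necessarily the paper's) learns the correlation between circumstances and economically healthy decisions. All function and variable names are illustrative, not the authors' API.

```python
# Minimal sketch of debt-driven adaptation triggering, assuming a simple
# definition of temporal net debt (SLA penalty accrued minus the cost of
# adapting now). All names below are illustrative, not the authors' API.
import numpy as np
from sklearn.linear_model import SGDClassifier

def net_debt(sla_penalty: float, adaptation_cost: float) -> float:
    """Positive net debt means adapting now is economically healthier."""
    return sla_penalty - adaptation_cost

clf = SGDClassifier()  # any classifier supporting partial_fit would do

def observe(features: np.ndarray, sla_penalty: float, adaptation_cost: float) -> None:
    """Label the circumstance by its net debt and update the model online."""
    label = 1 if net_debt(sla_penalty, adaptation_cost) > 0 else 0
    clf.partial_fit(features.reshape(1, -1), [label], classes=[0, 1])

def should_adapt(features: np.ndarray) -> bool:
    """Predict whether adapting under this circumstance would pay off."""
    return bool(clf.predict(features.reshape(1, -1))[0])
```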

    The IMS New Researchers' Survival Guide

    Statistics is a wonderfully diverse profession, and graduate students making career choices have many options — especially in light of the dearth of students moving into the statistical sciences today. The three main career paths at the PhD level are in academics, industry/business and government. Each of these job types offers its own mix of intellectual challenges, financial reward, pressure and security. How a new researcher selects (or is selected by) a specific occupation in the statistical sciences sometimes seems more a function of luck than of conscious decision making. This consideration was one of the first concerns addressed by the New Researchers Committee (NRC) of the Institute of Mathematical Statistics in 1988, and this guide is the product of that (and later) thinking. We believe that if students were better informed about their choices, they would be less apprehensive, pursue their goals more effectively and, ultimately, be far more likely to find positions for which they are well suited. Similarly, if doctoral students were generally more familiar with various aspects of professional life, the entire statistical community would benefit. Among the transitional facts of life with which we believe new researchers should be acquainted are: 1. mechanisms for applying for jobs, 2. expectations associated with different types of jobs, 3. techniques for initiating an active research program, and 4. methods of becoming more involved with the broader statistical community. The Survival Guide addresses these issues, but it also offers advice on a variety of other topics which new researchers may wish to consider as they prepare to leave graduate school. This guide is based on the Statistical Science article by the New Researchers Committee of IMS (1991). See Kruse (2002) on inspiration for statistics as a career path and Stasny (2001) on the big picture with respect to academic jobs. DeMets et al. (1998) and Shettle and Gaddy (1998) provide job outlooks for statisticians.

    Self-Modeling Regression with Random Effects Using Penalized Splines

    20 pages, 1 article: "Self-Modeling Regression with Random Effects Using Penalized Splines" (Altman, Naomi S.; Villarreal, Julio C.)

    A Learning Based Framework for Improving Querying on Web Interfaces of Curated Knowledge Bases

    Knowledge Bases (KBs) are widely used as one of the fundamental components in Semantic Web applications, as they provide facts and relationships that can be automatically understood by machines. Curated knowledge bases usually use the Resource Description Framework (RDF) as the data representation model. To query the RDF-represented knowledge in curated KBs, Web interfaces are built via SPARQL Endpoints. Currently, querying SPARQL Endpoints suffers from problems such as network instability and latency, which affect query efficiency. To address these issues, we propose a client-side caching framework, the SPARQL Endpoint Caching Framework (SECF), aimed at accelerating the overall querying speed over SPARQL Endpoints. SECF identifies queries likely to be issued by leveraging the querying patterns learned from clients' historical queries, and prefetches/caches these queries. In particular, we develop a distance function based on graph edit distance to measure the similarity of SPARQL queries. We propose a feature modelling method to transform SPARQL queries into vector representations that are fed into machine-learning algorithms. A time-aware smoothing-based method, Modified Simple Exponential Smoothing (MSES), is developed for cache replacement. Extensive experiments performed on real-world queries showcase the effectiveness of our approach, which outperforms the state-of-the-art work in terms of overall querying speed.
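    As an illustration of time-aware cache replacement in this spirit, the sketch below scores each cached query with plain exponential smoothing of its hit history and evicts the lowest-scoring entry. The abstract does not give the exact MSES formula, so the smoothing rule and all names here are assumptions, not SECF's implementation.

```python
# Illustrative time-aware cache: each entry's score is an exponentially
# smoothed estimate of recent demand; the least-promising entry is evicted.
class SmoothedCache:
    def __init__(self, capacity: int, alpha: float = 0.3):
        self.capacity = capacity
        self.alpha = alpha   # smoothing weight: higher favors recent activity
        self.store = {}      # query text -> cached result
        self.score = {}      # query text -> smoothed hit estimate

    def _touch(self, query: str) -> None:
        # Exponential smoothing: recent hits dominate, older ones decay.
        prev = self.score.get(query, 0.0)
        self.score[query] = self.alpha * 1.0 + (1 - self.alpha) * prev

    def get(self, query: str):
        self._touch(query)
        return self.store.get(query)  # None on a cache miss

    def put(self, query: str, result) -> None:
        if query not in self.store and len(self.store) >= self.capacity:
            victim = min(self.store, key=lambda q: self.score.get(q, 0.0))
            del self.store[victim]    # evict the lowest-scoring entry
        self.store[query] = result
        self._touch(query)
```

    With alpha around 0.3, a query hit repeatedly in the recent past outranks one that was popular long ago, which is the behavior a time-aware replacement policy is after.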

    The Amborella genome: an evolutionary reference for plant biology

    The nuclear genome sequence of Amborella trichopoda, the sister species to all other extant angiosperms, will be an exceptional resource for plant genomics.

    Comparison of next generation sequencing technologies for transcriptome characterization

    Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra-high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis.

    Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the ends of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics.

    Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, NG sequencing is a dramatic advance over capillary-based sequencing, but it also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms.
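    The kind of what-if arithmetic a tool like ESTcalc exposes can be sketched in a few lines. Under the deliberately crude assumption that reads land on genes uniformly at random (a Poisson model that ignores expression-level bias and library normalization), the expected number of genes tagged by a platform mixture follows directly; the gene count, throughput, and cost figures below are placeholders, not parameters from the paper.

```python
# Toy coverage-vs-cost estimate for a mixture of sequencing platforms.
# Assumes reads hit genes uniformly at random (Poisson approximation);
# all platform numbers are made up for illustration only.
import math

def expected_genes_tagged(n_genes: int, runs: dict, platforms: dict) -> float:
    """Expected number of genes tagged at least once by the run mixture."""
    total_reads = sum(platforms[p]["reads_per_run"] * k for p, k in runs.items())
    lam = total_reads / n_genes              # mean reads per gene
    return n_genes * (1 - math.exp(-lam))    # P(gene receives >= 1 read)

platforms = {  # hypothetical throughput and cost figures
    "FLX":    {"reads_per_run": 400_000,    "cost_per_run": 8_000},
    "Solexa": {"reads_per_run": 30_000_000, "cost_per_run": 3_000},
}
runs = {"FLX": 1, "Solexa": 1}
cost = sum(platforms[p]["cost_per_run"] * k for p, k in runs.items())
print(f"~{expected_genes_tagged(27_000, runs, platforms):.0f} genes tagged for ${cost}")
```

    A real simulator must also model read length, assembly behavior, and unequal expression levels, which is why the paper parameterizes its model from actual read mappings rather than a uniform assumption.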

    DFSeer: A visual analytics approach to facilitate model selection for demand forecasting

    Selecting an appropriate model to forecast product demand is critical in the manufacturing industry. However, due to data complexity, market uncertainty, and users' demanding requirements for the model, it is challenging for demand analysts to select a proper model. Although existing model selection methods can reduce the manual burden to some extent, they often fail to present model performance details on individual products or to reveal the potential risk of the selected model. This paper presents DFSeer, an interactive visualization system for reliable model selection in demand forecasting based on products with similar historical demand. It supports model comparison and selection at different levels of detail. Besides, it shows the differences in model performance on similar products to reveal the risk of model selection and increase users' confidence in choosing a forecasting model. Two case studies and interviews with domain experts demonstrate the effectiveness and usability of DFSeer.

    Comment: 10 pages, 5 figures, ACM CHI 2020.
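    A toy version of the underlying comparison logic, under our own assumptions rather than DFSeer's actual implementation: backtest each candidate model on every similar product's history and report not just the mean error but its spread across products, since a wide spread is exactly the selection risk a single aggregate score hides. The models here (naive last value, moving average) are simple stand-ins.

```python
# Hedged sketch: rank forecasting models by error spread across products
# with similar demand histories, not by one aggregate score.
import statistics

def naive(history):                 # forecast = last observed demand
    return history[-1]

def moving_average(history, k=3):   # forecast = mean of the last k points
    return sum(history[-k:]) / min(k, len(history))

def backtest(model, series):
    """One-step-ahead absolute errors over a single demand series."""
    return [abs(model(series[:t]) - series[t]) for t in range(3, len(series))]

def compare(models, similar_products):
    """Mean error and spread across similar products for each model."""
    for name, model in models.items():
        errs = [statistics.mean(backtest(model, s)) for s in similar_products]
        print(f"{name:15s} mean={statistics.mean(errs):6.2f} "
              f"spread={statistics.pstdev(errs):6.2f}")

similar_products = [[12, 15, 14, 18, 20, 19, 22],
                    [30, 28, 33, 31, 35, 34, 38],
                    [5, 7, 6, 9, 8, 11, 10]]
compare({"naive": naive, "moving_average": moving_average}, similar_products)
```

    A model with a slightly worse mean but a much tighter spread may be the safer pick, which is the trade-off an interactive view of per-product performance makes visible.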